Abstract

In conditional copula models, the copula parameter is deterministically
linked to a covariate via the calibration function. The latter is of central
interest for inference and is usually estimated nonparametrically. However, when
a parametric model for the calibration function is appropriate, the resulting
estimator exhibits significant gains in statistical efficiency and requires smaller
computational costs. We develop methodology for testing a parametric
formulation of the calibration function against a general alternative and propose a
generalized likelihood ratio-type test that enables conditional copula model
diagnostics. We derive the asymptotic null distribution of the proposed test and
study its finite sample performance using simulations. The method is applied
to two data examples.

Keywords: Constant copula; covariate effects; dynamic copula; local likelihood; 
model diagnostics; nonparametric inference. 
1 Introduction



Copulas are an important tool for modeling dependence. The recent development of
conditional copulas by Patton (2006) widely expands the range of possible
applications, as it allows covariate adjustment in copula structures and thus enables their
use in regression settings. Specifically, if $X$ is a covariate that affects the dependence
between the continuous random variables $Y_1$ and $Y_2$, then the conditional joint
distribution of $Y_1$ and $Y_2$ given $X = x$ can be written as
$H_X(y_1, y_2 \mid x) = C_X\{F_{1|X}(y_1 \mid x), F_{2|X}(y_2 \mid x) \mid x\}$,
where $F_{i|X}$ is the conditional marginal distribution of $Y_i$ given
$X = x$ for $i = 1, 2$ and $C_X$ is the conditional copula, i.e. the joint distribution of
$U_1 = F_{1|X}(Y_1 \mid x)$ and $U_2 = F_{2|X}(Y_2 \mid x)$ given $X = x$.

When the dependence structure is in the inferential focus, one needs to specify
a functional model between the covariate $X$ and the copula $C_X$. In the context of a
parametric copula family, Acar et al. (2011) have studied a nonparametric estimator
of the calibration function $\eta(X)$ in

$$(U_1, U_2) \mid X = x \;\sim\; C_X\{u_1, u_2 \mid \theta(x) = g^{-1}(\eta(x))\}, \qquad (1)$$

where $g : \Theta \to \mathbb{R}$ is a known link function that allows unrestricted estimation for $\eta$.

It is known that if a parametric model for $\eta(X)$ is suitable, then fitting a
nonparametric model leads to an unnecessary loss of efficiency. For instance, in Table 1
in Acar et al. (2011) this loss is illustrated in the case of an underlying linear
calibration function. Furthermore, a parametric formulation of $\eta(X)$ yields a much simpler
conditional copula model that is more convenient for subsequent analysis. Therefore,
it is of great practical importance to determine whether $\eta(X)$ can be reasonably
estimated using a simple parametric form. While one can construct pointwise confidence
intervals as in Acar et al. (2011) and check whether an estimated parametric
calibration function falls within the confidence intervals, such visual inspections are not
sufficient to make valid inference on the form of the calibration function. One needs
to construct simultaneous confidence intervals across the covariate range or rigorous
hypothesis tests for the specification of the calibration function. Here we take the
latter approach.

Our development focuses on hypotheses of the form $H_0$: "$\eta(\cdot)$ is linear in $X$"
versus $H_1$: "$\eta(\cdot)$ is not linear in $X$" under the conditional copula model in (1). This
class of hypotheses includes the important special case of $H_0$: "$\eta(\cdot)$ is constant"
versus $H_1$: "$\eta(\cdot)$ is not constant". This is of particular interest because in those
cases where $\eta$ can be reasonably estimated by a constant, one can rely on statistical
methods developed for the classical copula model.

However, such hypotheses cannot be tested using the canonical likelihood ratio
test (LRT) because estimation under the alternative hypothesis is performed
nonparametrically. Exploration of the asymptotic distribution of the ratio test falls within the
scope of the generalized likelihood ratio test (GLRT) developed by Fan et al. (2001)
for testing a parametric null hypothesis versus a nonparametric alternative
hypothesis. Since nonparametric maximum likelihood estimators are difficult to obtain and
may not even exist, Fan et al. (2001) suggested using any reasonable nonparametric
estimator under the alternative model. In particular, using a local polynomial
estimator to specify the alternative model in a number of hypothesis testing
problems, Fan et al. (2001) showed that the null distribution of the GLRT statistic follows
asymptotically a chi-square distribution with the number of degrees of freedom
independent of the nuisance parameters. This result, referred to as the Wilks phenomenon,
holds for the Gaussian white-noise model (Fan et al., 2001), varying-coefficient models,
which include the regression model as a special case (Fan et al., 2001), spectral
density (Fan and Zhang, 2004), additive models (Fan and Jiang, 2005) and single-index
models (Zhang et al., 2010).

We expand the GLRT-based approach to testing the calibration function in
conditional copula models. The test procedure employs the nonparametric estimator
proposed by Acar et al. (2011) when evaluating the local likelihood under the
alternative hypothesis. The major contribution is the construction of a rigorous framework
for such GLRT-based tests in the conditional copula context, which leads to improved
efficiency when a suitable parametric form can be specified. It is worth mentioning
that the proposal can easily accommodate the test for an arbitrary parametric form.
The description of the test, the derivation of its asymptotic null distribution and the
discussion of practical implementation are included in Section 2. The finite sample
performance of the test is illustrated using simulations and two data examples in
Sections 3 and 4, respectively. The paper ends with concluding remarks.
2 Generalized likelihood ratio test for copula functions



Suppose that $\{(U_{11}, U_{21}, X_1), \ldots, (U_{1n}, U_{2n}, X_n)\}$ is a random sample from the
conditional copula model (1). The hypothesis of interest is

$$H_0 : \eta(\cdot) \in \mathfrak{f}_L \quad \text{versus} \quad H_1 : \eta(\cdot) \notin \mathfrak{f}_L, \qquad (2)$$

where $\mathfrak{f}_L = \{\eta(\cdot) : \exists\, a_0, a_1 \in \mathbb{R} \text{ such that } \eta(X) = a_0 + a_1 X,\ \forall X \in \mathcal{X}\}$ denotes the
set of all linear functions on $\mathcal{X}$.

In what follows, we assume that the density $c_X$ of $C_X$ exists and for simplicity we
use the notation $\ell(t, u_1, u_2) = \ln c_X\{u_1, u_2;\, g^{-1}(t)\}$. Furthermore, the first and second
partial derivatives of $\ell$ with respect to $t$ are assumed to exist and are denoted by
$\ell_j(t, u_1, u_2) = \partial^j \ell(t, u_1, u_2) / \partial t^j$, for $j = 1, 2$.
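As a concrete instance of this notation, consider the Frank family with an identity link $g(\theta) = \theta$, which is the setting used later in the simulation study. This is only an illustration; the definitions above apply to any parametric family with a density. Writing out the log-density of the Frank copula gives, for $t \neq 0$:

```latex
\[
\ell(t, u_1, u_2)
  = \ln\{ t\,(1 - e^{-t}) \} - t\,(u_1 + u_2)
    - 2 \ln \bigl| (1 - e^{-t}) - (1 - e^{-t u_1})(1 - e^{-t u_2}) \bigr| ,
\]
\[
\ell(t, u_1, u_2) \to 0 \quad \text{as } t \to 0 \quad \text{(independence copula)} .
\]
```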

2.1 Proposed GLRT for the conditional copula model 

A natural way to approach (2) is through the likelihood ratio of the restricted (i.e.,
conditional copula with a linear calibration function) and the full (i.e., conditional
copula with an arbitrary calibration function) models, or equivalently, through the
difference

$$\sup_{\eta(\cdot) \notin \mathfrak{f}_L} \{L_n(H_1)\} \;-\; \sup_{\eta(\cdot) \in \mathfrak{f}_L} \{L_n(H_0)\},$$

where

$$L_n(H_0) = \sum_{i=1}^{n} \ell(a_0 + a_1 X_i, U_{1i}, U_{2i}), \qquad
L_n(H_1) = \sum_{i=1}^{n} \ell(\eta(X_i), U_{1i}, U_{2i}).$$

The supremum of the log-likelihood function under the null hypothesis is given by

$$L_n(H_0, \tilde\eta) = \sum_{i=1}^{n} \ell(\tilde\eta(X_i), U_{1i}, U_{2i}),$$

where $\tilde\eta(X) = \tilde a_0 + \tilde a_1 X$, with $\tilde a = (\tilde a_0, \tilde a_1)$ denoting the maximum likelihood estimator
of the parameter $a = (a_0, a_1)$. Under the alternative, the general unknown form of
$\eta(\cdot)$ adds significant complexity to the calculation of the supremum. We use the
nonparametric estimator of $\eta(\cdot)$ proposed by Acar et al. (2011) to define the log-likelihood
under the full model. Specifically, for each observation $X_i$ in a neighbourhood of an
interior point $x$, we approximate $\eta(X_i)$ linearly by

$$\eta(X_i) \approx \eta(x) + \eta'(x)(X_i - x) \equiv \beta_0 + \beta_1 (X_i - x),$$

provided that $\eta(x)$ is twice continuously differentiable. Estimates of $\beta = (\beta_0, \beta_1)$,
and of $\eta(x) = \beta_0$, are then obtained by maximizing a kernel-weighted local likelihood
function

$$\mathcal{L}(\beta, x) = \sum_{i=1}^{n} \ell\{\beta_0 + \beta_1 (X_i - x), U_{1i}, U_{2i}\}\, K_h(X_i - x), \qquad (3)$$

where $h > 0$ is a bandwidth parameter controlling the size of the neighbourhood
around $x$, $K$ is a symmetric kernel density function and $K_h(\cdot) = K(\cdot / h)/h$ weighs
the contribution of each data point based on its proximity to $x$. Similarly, if one
uses a $p$th order local polynomial estimator, the local linear approximation in (3) will
be replaced by $\sum_{j=0}^{p} \beta_j (X_i - x)^j$. The resulting estimator is given by $\hat\eta_h(x) = \hat\beta_0$.

Then we evaluate the log-likelihood function under the alternative hypothesis of (2)
as

$$L_n(H_1, \hat\eta_h) = \sum_{i=1}^{n} \ell\{\hat\eta_h(X_i), U_{1i}, U_{2i}\}.$$

The difference between the two log-likelihoods allows us to evaluate the evidence
in the data in favor of (or against) the null model. Hence, the generalized likelihood
ratio statistic is given by

$$\lambda_n(h) = L_n(H_1, \hat\eta_h) - L_n(H_0, \tilde\eta). \qquad (4)$$

While large values of $\lambda_n(h)$ suggest the rejection of the null hypothesis, we need to
determine the rejection region for the test. In order to inform the decision in finite
samples we investigate the asymptotic distribution of the GLRT statistic under the
null hypothesis.
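The construction of the statistic in (4) can be sketched in a few lines of code. The sketch below is illustrative only and rests on several assumptions that are not part of the text: it uses the Frank family with an identity link, the Epanechnikov kernel, a local constant fit ($p = 0$) tested against the null "$\eta$ is constant", and a golden-section search on a bounded interval in place of a proper optimizer (which presumes the objective is unimodal there). All function names are hypothetical.

```python
import math

def frank_logpdf(u, v, theta):
    """log c(u, v; theta) for the Frank copula; identity link, so t = theta."""
    if abs(theta) < 1e-8:
        return 0.0  # independence limit as theta -> 0
    # (1 - e^{-th}) - (1 - e^{-th u})(1 - e^{-th v}), regrouped into two
    # same-sign terms to avoid cancellation for large |theta|.
    denom = (math.exp(-theta * u) * (-math.expm1(-theta * v))
             + math.exp(-theta * v) * (-math.expm1(-theta * (1.0 - v))))
    return (math.log(theta * (-math.expm1(-theta)))
            - theta * (u + v) - 2.0 * math.log(abs(denom)))

def kernel_h(z, h):
    """K_h(z) = K(z / h) / h with the Epanechnikov kernel."""
    s = z / h
    return 0.75 * (1.0 - s * s) / h if abs(s) < 1.0 else 0.0

def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for a maximizer of f on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - ratio * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + ratio * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

def local_constant_eta(x0, data, h, bounds=(-40.0, 40.0)):
    """Maximize the kernel-weighted local likelihood (3) with p = 0 at x0."""
    pts = [(kernel_h(x - x0, h), u, v) for x, u, v in data]
    pts = [p for p in pts if p[0] > 0.0]
    return golden_max(
        lambda t: sum(w * frank_logpdf(u, v, t) for w, u, v in pts), *bounds)

def glrt_statistic(data, h, bounds=(-40.0, 40.0)):
    """lambda_n(h) in (4) for the null hypothesis 'eta is constant'."""
    eta_null = golden_max(
        lambda t: sum(frank_logpdf(u, v, t) for _, u, v in data), *bounds)
    loglik_null = sum(frank_logpdf(u, v, eta_null) for _, u, v in data)
    loglik_alt = sum(
        frank_logpdf(u, v, local_constant_eta(x, data, h, bounds))
        for x, u, v in data)
    return loglik_alt - loglik_null
```

Here `data` is a list of triples `(x, u1, u2)` on the uniform scale. A local linear fit would replace the one-dimensional search by a maximization over $(\beta_0, \beta_1)$; the local constant case is shown only because it keeps the sketch short.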

2.2 Asymptotic distributions of proposed GLRT statistic 

To facilitate our presentation we introduce the following notation. Let $f(x) > 0$ be
the density function of $X$ with support $\mathcal{X}$ and denote by $|\mathcal{X}|$ the range of the covariate
$X$. Also, denote by $K * K$ the convolution of the kernel $K$ and define

$$\mu_n = \frac{|\mathcal{X}|}{h} \Bigl\{ K(0) - \frac{1}{2} \int K^2(t)\, dt \Bigr\} = \frac{|\mathcal{X}|}{h}\, c_K,
\qquad
\nu_n = \frac{|\mathcal{X}|}{2h}\, \| 2K - K * K \|_2^2 .$$

The following result states that the GLRT statistic follows asymptotically a normal
or equivalently a chi-square distribution in the case of negligible bias, where the mean
and variance are related to the quantities $\mu_n$ and $\nu_n$, respectively. The technical
conditions and proofs are deferred to the Appendix.

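Theorem 1. Assume that the conditions (C1)-(C7) in the Appendix hold and the
GLRT statistic $\lambda_n(h)$ is constructed from $\hat\eta_h$ with a local linear estimator. Then, as
$h \to 0$ and $nh^{3/2} \to \infty$,

$$\nu_n^{-1/2} \bigl( \lambda_n(h) - \mu_n + d_n \bigr) \xrightarrow{d} N(0, 1), \qquad (5)$$

where $d_n = O_p(nh^4 + n^{1/2} h^2)$.

Furthermore, if $\eta$ is linear or $nh^{9/2} \to 0$, then, as $nh^{3/2} \to \infty$,

$$r_K \lambda_n(h) \overset{a}{\sim} \chi^2_{r_K \mu_n}, \qquad (6)$$

where $r_K = 2\mu_n / \nu_n$.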

It should be noted that when $\eta$ is linear, the asymptotic bias $d_n$ becomes exactly
zero, as shown in (A.1) in the Appendix, and thus the condition $nh^{9/2} \to 0$ is not
needed (the optimal bandwidth for estimation is of the order $n^{-1/5}$; see
Acar et al., 2011). More importantly, this facilitates the calculation of the GLRT statistic $\lambda_n(h)$
in practice, since one can use directly the bandwidth used for estimation, chosen by
the leave-one-out cross-validated likelihood (Acar et al., 2011). Our simulation study
in Section 3 provides empirical support for this suggestion.

Moreover, the asymptotic results in Theorem 1 can be easily extended to the case
where $\lambda_n(h)$ is based on a $p$th order local polynomial estimator, by substituting the
kernel function $K$ with its equivalent kernel $K^*$ in $c_K$ and $r_K$ (see Fan and Gijbels,
1996, page 64, for the expression of $K^*$) induced by the local polynomial fitting
(Fan et al., 2001). The asymptotic chi-square distribution (6) continues to hold if
either $\eta$ is a polynomial of degree $p$ or $nh^{(4p+5)/2} \to 0$, as the asymptotic bias $d_n =
O_p(nh^{2p+2} + n^{1/2} h^{p+1})$. The practical implication of such an extension is that, if
the interest is to test a null hypothesis of a polynomial form $\eta(x) = \sum_{j=0}^{p} a_j x^j$,
it is recommended to calculate $\lambda_n(h)$ using the local polynomial estimator with the
corresponding degree $p$. This avoids the possible necessity of undersmoothing in order
to make the asymptotic bias negligible.

As pointed out earlier, the hypothesis of $\eta$ being constant is a special case of the
linearity constraint and leads to the classical copula model (i.e., no covariate
adjustment is required). If this hypothesis is of interest, using a local constant estimator,
i.e., $p = 0$, to calculate $\lambda_n(h)$ may be more appealing (as confirmed by the simulations
in Section 3) than using a local linear estimator. The latter tends to overfit even with
a large bandwidth when $H_0$ indeed holds, thus resulting in an inflated type I error.

One can conclude from Theorem 1 that the GLRT is fairly similar to the classical
likelihood ratio test. The tabulated value of the scaling constant $r_K$ is close to 2 for
commonly used kernels. For instance, $r_K = 2.115$ for the commonly used
Epanechnikov kernel $K(u) = 0.75(1 - u^2)\, 1_{\{|u| \le 1\}}$. The degrees of freedom (df) $r_K c_K |\mathcal{X}|/h$ of
the asymptotic null distribution of the GLRT tend to infinity when $h \to 0$, due to the
nonparametric nature of the alternative hypothesis. One can interpret the quantity
$|\mathcal{X}|/h$ as the number of nonintersecting intervals on $\mathcal{X}$, and thus $r_K c_K |\mathcal{X}|/h$
approximates the effective number of parameters in the nonparametric estimation. For the
Epanechnikov kernel with $c_K = 0.45$, the degrees of freedom is given by $0.968\, |\mathcal{X}|/h$.
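The kernel-dependent constants can be checked numerically. The following sketch approximates $c_K = K(0) - \tfrac12 \int K^2$ and $r_K = c_K / \int (K - \tfrac12 K*K)^2$ (equivalent to $r_K = 2\mu_n/\nu_n$ with the definitions above) by Riemann sums on a regular grid. The function names, the grid size `m` and the assumption that the kernel is supported on `[-half_width, half_width]` are illustrative choices, not part of the paper.

```python
def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def glrt_constants(kernel, half_width=1.0, m=200):
    """Return (c_K, r_K) for a kernel supported on [-half_width, half_width].

    c_K = K(0) - 0.5 * int K^2,   r_K = c_K / int (K - 0.5 * K*K)^2,
    both approximated by Riemann sums with step half_width / m.
    """
    step = half_width / m
    grid = [(i - m) * step for i in range(2 * m + 1)]   # support of K
    k_vals = [kernel(s) for s in grid]
    c_k = kernel(0.0) - 0.5 * sum(v * v for v in k_vals) * step

    denom = 0.0
    for j in range(-2 * m, 2 * m + 1):                  # support of K*K
        t = j * step
        # K*K(t) = int K(s) K(t - s) ds
        conv = sum(kv * kernel(t - s) for kv, s in zip(k_vals, grid)) * step
        denom += (kernel(t) - 0.5 * conv) ** 2
    denom *= step
    return c_k, c_k / denom

c_k, r_k = glrt_constants(epanechnikov)
```

For the Epanechnikov kernel the two returned values should land close to the constants $c_K = 0.45$ and $r_K = 2.115$ quoted in the text, up to discretization error.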



3 Simulation Study 

We conduct simulations to evaluate the finite sample performance of the proposed
test for the linear hypothesis given in (2). We consider three simulation scenarios
corresponding to three calibration functions.

$M_0$: $\eta_0(X) = 8$,

$M_1$: $\eta_1(X) = 25 - 4.2\, X$,

$M_2$: $\eta_2(X) = 12 + 8 \sin(0.4 X^2)$.

The copula used belongs to the Frank family and has the form

$$C(u_1, u_2 \mid \theta) = -\frac{1}{\theta} \ln \Bigl\{ 1 + \frac{(e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1)}{e^{-\theta} - 1} \Bigr\}, \qquad \theta \in (-\infty, \infty) \setminus \{0\}.$$

Since the range of $\theta$ is $\mathbb{R} \setminus \{0\}$, an identity link is used, i.e., $\theta_k(X) = \eta_k(X)$ for
$k = 0, 1, 2$. Similar findings (not reported here) were obtained for the simulations
using the Clayton copula.

Our Monte Carlo experiment consists of 200 replicated samples of sizes $n = 200$
and 500 generated from each model. Specifically, under model $M_k$ we first simulate
the covariate values $X_i \sim U(2, 5)$, $i = 1, \ldots, n$ and then, conditional on $X_i$, the
uniform pairs $(U_{1i}, U_{2i})$ are sampled from the Frank family with copula parameter
$\theta_k(X_i) = \eta_k(X_i)$ induced by the calibration model $M_k$, for all $k = 0, 1, 2$. Throughout
the simulations we have used the Epanechnikov kernel. For each Monte Carlo sample,
the leave-one-out cross-validated likelihood method of Acar et al. (2011) is employed
to select, out of 12 pilot values ranging from 0.33 to 2.96 and equally spaced in
logarithmic scale, the optimum bandwidth $h$ for the local polynomial estimation of
the calibration function. We have followed the suggestion made in Section 2 and have
calculated the nonparametric estimator for $\eta$ using a local polynomial of the same
degree as specified by the null hypothesis. For instance, in Table 1, when testing
$H_0 : \eta = c$, we consider a local constant estimator (with $p = 0$) for $\eta$ under the
alternative model. Subsequently, the GLRT statistic $\lambda_n(h)$ is computed using the
same bandwidth $h$ that is used for estimation. We also assume that in practice one
would first test for a constant calibration function and, conditional on rejection, would
test for linear calibration. For this reason, in Table 1 we do not report the results of
testing $H_0 : \eta(x) = a_0 + a_1 x$ when the generating model is $M_0$.
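The data-generating step described above can be sketched as follows. The sampler draws $U_2$ by inverting the conditional distribution of the Frank copula given $U_1$, which is a standard technique but is an assumption here, since the text does not state how the pairs were generated. The function and dictionary names are hypothetical.

```python
import math
import random

# Calibration functions of the three simulation scenarios (identity link).
CALIBRATION = {
    "M0": lambda x: 8.0,
    "M1": lambda x: 25.0 - 4.2 * x,
    "M2": lambda x: 12.0 + 8.0 * math.sin(0.4 * x * x),
}

def frank_conditional_draw(u, w, theta):
    """Solve C(v | u; theta) = w for v, with C(. | u) the conditional cdf."""
    if abs(theta) < 1e-8:
        return w  # independence limit
    a = math.exp(-theta * u)
    # exp(-theta * v) written as a ratio of positive terms (stable form).
    ratio = (w * math.exp(-theta) + (1.0 - w) * a) / (w + (1.0 - w) * a)
    return -math.log(ratio) / theta

def simulate_sample(n, eta, rng):
    """Draw n triples (X, U1, U2) with X ~ U(2, 5) and Frank parameter eta(X)."""
    sample = []
    for _ in range(n):
        x = rng.uniform(2.0, 5.0)
        u, w = rng.random(), rng.random()
        sample.append((x, u, frank_conditional_draw(u, w, eta(x))))
    return sample

def bandwidth_grid(lo=0.33, hi=2.96, size=12):
    """Pilot bandwidths equally spaced on the logarithmic scale."""
    return [lo * (hi / lo) ** (i / (size - 1)) for i in range(size)]

sample = simulate_sample(200, CALIBRATION["M1"], random.Random(2024))
pilots = bandwidth_grid()
```

The seed and the sample size in the last two lines are arbitrary; the bandwidth selection itself (leave-one-out cross-validated likelihood) is not reproduced here.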

One can notice from Table 1 that the rejection rates under the null are very close
to the target values of the type I error probabilities $\alpha \in \{0.1, 0.05, 0.01\}$, for both
linear and constant nulls (models $M_0$ and $M_1$). Our approach leads to high power in
detecting departures from the null, as one can see from the results generated under
models $M_1$ and $M_2$. For clearer visualization, the entries in the table that correspond
to power are marked with an asterisk.

Table 1. Demonstration of the proposed GLRT for testing the linear/constant null
hypothesis $H_0$ at $\alpha$ = 0.10, 0.05 and 0.01, respectively. Shown are the rejection
frequencies assessed from 200 Monte Carlo replicates. The sample sizes are n = 200 and
n = 500, where the generating models are shown in the "True Model" column. Those
entries in the table reflecting the power of the testing procedure are marked with an asterisk.

                      Null: eta(x) = a0 + a1 x       Null: eta(x) = c
True Model     n       .10      .05      .01         .10      .05      .01
M0            200       -        -        -          .105     .040     .020
              500       -        -        -          .110     .045     .005
M1            200      .100     .055     .005        .995*    .990*    .955*
              500      .085     .055     .010        1.00*    1.00*    1.00*
M2            200      1.00*    1.00*    1.00*       1.00*    1.00*    1.00*
              500      1.00*    1.00*    1.00*       1.00*    1.00*    1.00*



4 Data Application 



In this section, we apply the GLRT to the two data examples studied in
Acar et al. (2011). Our aim is to check whether a constant copula model or a conditional
copula model with a linear calibration function fits these examples reasonably well, i.e.
whether the nonparametric calibration estimates of Acar et al. (2011) are in fact
necessary.



4.1 Twin birth data 

This data set contains the birth weights and the gestational age of 450 twin pairs
from the Matched Multiple Birth Data Set (MMB) of the National Center for Health
Statistics. Of interest is the dependence between the birth weights ($BW_1$ and $BW_2$) of
the first- and second-born twins given their gestational age $GA$. We follow
Acar et al. (2011) and transform the data on the uniform scale, as shown in the left panel of
Figure 1, and use the Frank family of copulas to model the dependence structure.
The right panel of Figure 1 shows the maximum likelihood estimates obtained under
the constant calibration assumption (solid line), linear calibration assumption
(long-dashed line), the nonparametric estimates with $p = 0$ (dot-dashed line), $p = 1$ (dashed
line) and 90% pointwise confidence intervals for the local linear estimates (dotted
lines), obtained as in Acar et al. (2011).
Figure 1. Scatterplot of the conditional marginal distributions of birth weights given
the gestational age (left panel) and the plot of calibration function estimates under the 
Frank copula (right panel): maximum likelihood estimate of the constant calibration 
function (solid line), maximum likelihood estimate of the linear calibration function 
(long-dashed line), local constant estimates (dot-dashed line), local linear estimates 
(dashed line), 90% pointwise confidence intervals for the local linear estimates (dotted 
lines). 

As seen in Figure 1, the maximum likelihood estimates under constant and linear
calibration assumptions are not within the confidence intervals of the local linear
estimates, suggesting that these simple parametric formulations may not be appropriate.
This empirical observation is confirmed by the GLRT tests, which yielded p-values 
smaller than 10~^ for both tests (test statistics are 13.58 on 3.92 df and 12.95 on 3.36 
df for the constant and linear hypothesis, respectively). 

Thus, we conclude that the variation in the strength of dependence between the
twin birth weights at different gestational ages, as represented by the nonparametric
estimates in the right panel of Figure 1, is statistically significant.
4.2 Framingham Heart Study data

This data set comes from the Framingham Heart Study (FHS) and contains the
log-pulse pressures of 348 subjects at the first two examination periods, denoted
by $\log(PP_1)$ and $\log(PP_2)$, respectively, as well as the change in body mass index
$\Delta BMI$ between these periods. The left panel of Figure 2 displays the conditional
marginal distributions of the log-pulse pressures given $\Delta BMI$, which are obtained
parametrically as in Acar et al. (2011).

The estimates of the calibration function are obtained under the chosen Frank
family using the maximum likelihood estimation with constant and linear calibration
forms and the nonparametric estimation with $p = 0$ and $p = 1$. The results are shown
in the right panel of Figure 2.
Figure 2. Scatterplot of the conditional marginal distributions of the log-pulse pres-
sures given the change in body mass index (left panel) and the plot of calibration 
function estimates under the Frank copula (right panel): maximum likelihood estimate 
of the constant calibration function (solid line), maximum likelihood estimate of the lin- 
ear calibration function (long-dashed line), local constant estimates (dot-dashed line), 
local linear estimates (dashed line), 90% pointwise confidence intervals for the local 
linear estimates (dotted lines). 

Based on Figure 2, we suspect that a constant copula model may be appropriate.
To decide whether the fitted constant copula model is appropriate, we perform
the GLRT using the local constant estimates at the bandwidth value $h = 3.45$. This
bandwidth choice leads to 2.66 df of the chi-square distribution. The difference
between the log-likelihoods of the alternative and null conditional copula models is 0.91
and consequently the p-value is 0.514. Thus, we conclude that the change in body
mass index does not have any significant effect on the strength of dependence between
the two log-pulse pressures.
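The step from the log-likelihood difference to a p-value can be sketched as follows, using the chi-square approximation of Theorem 1: the statistic $r_K \lambda_n(h)$ is compared with a chi-square distribution on the stated degrees of freedom. The sketch assumes the Epanechnikov value $r_K = 2.115$ quoted in Section 2 and evaluates the chi-square tail with a plain series for the incomplete gamma function, which is adequate for moderate arguments but loses accuracy far in the tail. Function names are hypothetical.

```python
import math

def chi2_sf(x, df, max_terms=10000):
    """Upper tail P(chi2_df > x) via the series for the lower incomplete gamma."""
    if x <= 0.0:
        return 1.0
    a, z = 0.5 * df, 0.5 * x
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < 1e-17 * total:
            break
    lower = math.exp(a * math.log(z) - z - math.lgamma(a)) * total
    return min(1.0, max(0.0, 1.0 - lower))

def glrt_pvalue(loglik_diff, df, r_k):
    """p-value of the GLRT: r_K * lambda_n(h) referred to chi-square(df)."""
    return chi2_sf(r_k * loglik_diff, df)

# Inputs as reported for the Framingham example (rounded values).
p_fhs = glrt_pvalue(0.91, 2.66, 2.115)
```

With the rounded inputs reported above, `p_fhs` should fall in the neighbourhood of the reported p-value; exact agreement is not expected because the log-likelihood difference, the degrees of freedom and $r_K$ are all rounded.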



5 Conclusion 

Adjusting statistical dependence for covariates via conditional copulas is an active
area of research where model fitting and validation are currently in early
development. This paper takes a first step towards establishing conditional copula model
diagnostics by presenting a formal test of hypothesis for the calibration function.
Inspired by the generalized likelihood ratio idea of Fan et al. (2001), the proposed test
uses the local likelihood estimator of Acar et al. (2011) to specify the model under the
alternative when testing a parametric calibration function hypothesis. The
asymptotic null distribution of the test statistic, shown to be a chi-squared distribution with
the number of degrees of freedom determined by the estimation-optimal bandwidth,
is used to determine the rejection region in finite samples. Simulations suggest that
the method has high power of detecting departures from the null model and yields
the targeted type I error probability.

The GLRT procedure presented here can be easily adapted to test an arbitrary 
parametric calibration function. Furthermore, the approach can be extended to em- 
ploy other nonparametric estimators, such as smoothing splines, although with ad- 
ditional effort of deriving the asymptotic null distribution. Nevertheless, the asymp- 
totic null distribution may not always be appropriate for determining the rejection 
region in finite samples. While conditional bootstrap is usually used to assess the null 
distribution of the GLRT in regression-based problems, defining a similar bootstrap 
procedure in the conditional copula setting is not straightforward and requires further 
study. 
A Regularity Conditions and Technical Proofs

The asymptotic distribution of the GLRT statistic relies on the following technical
conditions. The conditions (C1)-(C3) are standard in nonparametric estimation and
the conditions (C4)-(C7) are required to regularize the conditional copula density.

(C1) The density function $f(X) > 0$ of the covariate $X$ is Lipschitz continuous, and
$X$ has a bounded support $\mathcal{X}$.

(C2) The kernel function $K(t)$ is a symmetric probability density function that is
bounded and Lipschitz continuous.

(C3) The functions $\eta$ and $g^{-1}$ have $(p+1)$th continuous derivatives, where $p = 1$ when
a local linear estimator is used for $\lambda_n(h)$.

(C4) The functions $\ell_1\{\eta(x), u_1, u_2\}$ and $\ell_2\{\eta(x), u_1, u_2\}$ exist and are continuous on
$\mathcal{X} \times (0, 1)^2$, and can be bounded by integrable functions of $u_1$ and $u_2$.

(C5) E|{£i(77(x),ui,U2) |x}|'<oo. 

(C6) $E\{\ell_2(\eta(x), U_1, U_2) \mid x\}$ is Lipschitz continuous.

(C7) The function $\ell_2(t, u_1, u_2) < 0$ for all $t \in \mathbb{R}$, and $u_1, u_2 \in (0, 1)$. For some
integrable function $k$, and for $t_1$ and $t_2$ in a compact set,

$$|\ell_2(t_1, u_1, u_2) - \ell_2(t_2, u_1, u_2)| < k(u_1, u_2)\, |t_1 - t_2|.$$



In addition, for some constants > 2 and ko > 0, j = 1, 2, 3, 

e\ sup |£2(r/(x,X) + m^z.,[/i,f/2)| ^f^'"V(^f^)|^ = 0(l), 

where $\bar\eta(x, X) = \eta(x) + \eta'(x)(X - x)$.

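Before proving Theorem 1, we shall introduce additional notation. Let $\gamma_n =
1/\sqrt{nh}$ and define

$$\alpha_n(x) = \frac{\gamma_n^2}{\sigma^2(x) f(x)} \sum_{i=1}^{n} \ell_1(\eta(X_i), U_{1i}, U_{2i})\, K((X_i - x)/h),$$

$$R_n(x) = \frac{\gamma_n^2}{\sigma^2(x) f(x)} \sum_{i=1}^{n} \bigl[ \ell_1(\bar\eta(x, X_i), U_{1i}, U_{2i}) - \ell_1(\eta(X_i), U_{1i}, U_{2i}) \bigr]\, K((X_i - x)/h),$$

where $\sigma^2(x) = -E[\ell_2\{\eta(x), U_1, U_2\} \mid X = x]$ denotes the Fisher information for
$\eta(x)$ at any $x \in \mathcal{X}$.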
Recall that $\bar\eta(x, X_i) = \eta(x) + \eta'(x)(X_i - x)$, define

$$R_{n1} = \sum_{k=1}^{n} \ell_1(\eta(X_k), U_{1k}, U_{2k})\, R_n(X_k),$$

$$R_{n2} = -\sum_{k=1}^{n} \ell_2(\eta(X_k), U_{1k}, U_{2k})\, \alpha_n(X_k)\, R_n(X_k),$$

$$R_{n3} = -\frac{1}{2} \sum_{k=1}^{n} \ell_2(\eta(X_k), U_{1k}, U_{2k})\, R_n^2(X_k),$$

and set 

n 71 
j=l j=l 

Lemmas 1-3 are used in our derivations, and their proofs are given at the end of
this appendix.

Lemma 1. Under conditions (C1)-(C7),

$$\hat\eta_h(x) - \eta(x) = \{\alpha_n(x) + R_n(x)\}\,(1 + o_p(1)).$$

Remark. Note that, when $\eta$ is linear, $R_n(x)$ directly becomes zero, since for each
$i = 1, \ldots, n$

$$\bar\eta(x, X_i) = a_0 + a_1 x + a_1 (X_i - x) = \eta(X_i). \qquad (A.1)$$

This is clearly also the case when $\eta$ is constant.

Lemma 2. Under conditions (C1)-(C7), as $h \to 0$ and $nh^{3/2} \to \infty$,
T„, = iA-(0)Elf-'(X)] + 1 '''"'g- <.('KX,). Uu, U„) 

xKt,{X,-Xk)+ Op{h-'/^), 




n2 



iE|r'mi/K'(t)dt-i;E 

i<j 



£i(r7(Xi),Uii,Ua) 
a2(X0f(Xi) 



To introduce Lemma 3, we first restate a proposition in de Jong (1987), where the
notation is adapted to ours. Let $X_1, X_2, \ldots$ be independent variables, and $w_{ijn}(\cdot, \cdot)$
Borel functions such that $W(n) = \sum_{1 \le i \le n} \sum_{1 \le j \le n} w_{ijn}(X_i, X_j)$, and $W_{ij} = w_{ijn}(X_i, X_j)
+ w_{jin}(X_j, X_i)$, where the index $n$ is suppressed in $W_{ij}$. Following de Jong (1987,
Definition 2.1), $W(n)$ is called clean if the conditional expectations of $W_{ij}$ vanish:
$E(W_{ij} \mid X_i) = 0$ a.s. for all $i, j \le n$.



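Proposition 3.2 (de Jong, 1987). Let $W(n)$ be clean with variance $\nu^*$. If $G_I$, $G_{II}$
and $G_{IV}$ are of lower order than $\nu^{*2}$, then

$$\nu^{*-1/2}\, W(n) \xrightarrow{d} N(0, 1), \qquad n \to \infty,$$

where

$$G_I = \sum_{1 \le i < j \le n} E(W_{ij}^4), \qquad
G_{II} = \sum_{1 \le i < j < k \le n} \bigl\{ E(W_{ij}^2 W_{ik}^2) + E(W_{ji}^2 W_{jk}^2) + E(W_{ki}^2 W_{kj}^2) \bigr\},$$

$$G_{IV} = \sum_{1 \le i < j < k < l \le n} \bigl\{ E(W_{ij} W_{ik} W_{lj} W_{lk}) + E(W_{ij} W_{il} W_{kj} W_{kl}) + E(W_{ik} W_{il} W_{jk} W_{jl}) \bigr\}.$$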

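We now define the following U-statistic,

$$W(n) = n^{-1} h^{1/2} \sum_{i \ne j} \{\sigma^2(X_i) f(X_i)\}^{-1}\, \ell_1(\eta(X_j), U_{1j}, U_{2j})\, \ell_1(\eta(X_i), U_{1i}, U_{2i})
\times \{ 2K_h(X_j - X_i) - K_h * K_h(X_j - X_i) \}. \qquad (A.2)$$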



Lemma 3. Under conditions (C1)-(C7), $W(n)$ defined in (A.2) is clean and $W(n) \xrightarrow{d}
N(0, \nu^*)$, as $h \to 0$ and $nh^{3/2} \to \infty$, where $\nu^* = 2\, \|2K - K * K\|_2^2\, E[f^{-1}(X)]$.
Proof of Theorem 1. To provide a general framework, we use $\eta(X_k)$ and $\tilde\eta(X_k)$ to
denote the true value under the null hypothesis and its maximum likelihood estimator,
respectively. Then, the GLRT statistic can be written as

$$\lambda_n(h) = \sum_{k=1}^{n} \bigl[ \ell\{\hat\eta_h(X_k), U_{1k}, U_{2k}\} - \ell\{\eta(X_k), U_{1k}, U_{2k}\} \bigr]
- \sum_{k=1}^{n} \bigl[ \ell\{\tilde\eta(X_k), U_{1k}, U_{2k}\} - \ell\{\eta(X_k), U_{1k}, U_{2k}\} \bigr]
\equiv \lambda_{1n}(h) - \lambda_{2n}.$$

Here $\lambda_{2n}$ corresponds to the canonical likelihood ratio statistic and it is $\lambda_{1n}(h)$ that
governs the asymptotic distribution of $\lambda_n(h)$.

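To derive the asymptotic distribution of $\lambda_{1n}(h)$, first approximate $\ell\{\hat\eta_h(X_k), U_{1k}, U_{2k}\}$
around $\eta(X_k)$:

$$\lambda_{1n}(h) \approx \sum_{k=1}^{n} \ell_1(\eta(X_k), U_{1k}, U_{2k})\, \{\hat\eta_h(X_k) - \eta(X_k)\}
+ \frac{1}{2} \sum_{k=1}^{n} \ell_2(\eta(X_k), U_{1k}, U_{2k})\, \{\hat\eta_h(X_k) - \eta(X_k)\}^2 .$$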

Applying Lemma 1 and Lemma 2 yields

- = - h-'E[f-\X)] |k(0) - J K2(t)dt/2| 



Rnl + Rn2 + Rn3 + Op {u ^) + Op(/l" 



-1/2^ 



By calculating of the leading terms Rn2 and Rns, one can show that 

"1,2 r 

Rnl = ^i^^^^-^)' ^i'^' U^kWiXk) J e K{t)dt{l + Op(l)) = Opin^'^h'' 



k=l 

,2 



~^"2 = X a2(Xfc)/(Xfc) — ^ (^fc)^o(l + Op(l)) = Op(n / /i ), 
-i?„,3 = — i5V'(X)V(X)a;o(l + Op(l)) = 0,(n/i^), 

o 
where $\omega_0 = \int\!\!\int t^2 (s + t)^2 K(t) K(s + t)\, ds\, dt$. Thus,

Rn^ - {Rnl - Rn2) = Op{nh^ + n^'^K"). 

This results in 

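where $W(n)$ is as defined in (A.2). Applying Lemma 3, we arrive at $W(n) \xrightarrow{d} N(0, \nu^*)$,
where $\nu^* = 2\, \|2K - K * K\|_2^2\, E[f^{-1}(X)]$. Hence,

$$\nu_n^{-1/2} \bigl( \lambda_{1n}(h) - \mu_n + d_n \bigr) \xrightarrow{d} N(0, 1),$$

where $\nu_n = (4h)^{-1} \nu^*$. For the asymptotic null distribution of $\lambda_n(h)$, this result can
be re-written as

$$\nu_n^{-1/2} \bigl\{ \lambda_n(h) - \mu_n + d_n + \lambda_{2n} \bigr\} \xrightarrow{d} N(0, 1).$$

Since $\lambda_{2n} = O_p(1)$, it vanishes compared to $\lambda_{1n}(h) = O_p(h^{-1})$ and we obtain

$$\nu_n^{-1/2} \bigl( \lambda_n(h) - \mu_n + d_n \bigr) \xrightarrow{d} N(0, 1).$$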

For the second result, note that the distribution $N(a_n, 2a_n)$ is approximately the same
as the chi-square distribution with $a_n$ degrees of freedom, for a sequence $a_n \to \infty$.
Letting $a_n = 2\mu_n^2 / \nu_n$ and $r_K = 2\mu_n / \nu_n$, we have

$$(2a_n)^{-1/2} \bigl( r_K \lambda_n(h) - a_n \bigr) \xrightarrow{d} N(0, 1),$$

provided that $d_n$ vanishes. □
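The algebra behind this last step is short and may be worth spelling out. With $a_n = 2\mu_n^2/\nu_n$ and $r_K = 2\mu_n/\nu_n$ as defined above, and taking $d_n = 0$, the chi-square standardization reduces exactly to the normal standardization in (5):

```latex
\begin{align*}
r_K \mu_n &= \frac{2\mu_n^2}{\nu_n} = a_n ,
\qquad
r_K^2\, \nu_n = \frac{4\mu_n^2}{\nu_n} = 2 a_n , \\[4pt]
\frac{r_K \lambda_n(h) - a_n}{\sqrt{2 a_n}}
  &= \frac{r_K \{\lambda_n(h) - \mu_n\}}{r_K\, \nu_n^{1/2}}
   = \nu_n^{-1/2} \{\lambda_n(h) - \mu_n\} .
\end{align*}
```

In particular, the degrees of freedom $a_n$ coincide with $r_K \mu_n$, the quantity appearing in (6).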

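Additional Technical Details

Proof of Lemma 1. Define

$$b = \gamma_n^{-1} \bigl( \beta_0 - \eta(x),\; h(\beta_1 - \eta'(x)) \bigr)^T,$$

so that each component has the same rate of convergence. Then, we have

$$\beta_0 + \beta_1 (X_i - x) = \bar\eta(x, X_i) + \gamma_n b^T z_{i,x},$$

where $z_{i,x} = (1, (X_i - x)/h)^T$. The local log-likelihood function can be re-written in
terms of $b$,

$$\mathcal{L}(b) = \sum_{i=1}^{n} \ell\bigl( \bar\eta(x, X_i) + \gamma_n b^T z_{i,x},\, U_{1i}, U_{2i} \bigr)\, K_h(X_i - x).$$

Note that $\hat b = \gamma_n^{-1} \bigl( \hat\beta_0 - \eta(x),\; h(\hat\beta_1 - \eta'(x)) \bigr)^T$ maximizes $\mathcal{L}(b)$. It also maximizes
the following normalized function,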

n 

1=1 

which can be written as 

n 

C*{b) = h^r^Y.^Mx,X,),Uu,U2;)b^Zi,xKn{X,-x) 

i=l 
2 ^ 

+ E Xi) + mjzi,,, Uu, U2i) {b^z,,xf Kn{Xi - x) 

^ i=i 

n 
i=l 

1 " 

+ 2-'b^ \^-J2Hv{x,Xi) + mJzi,x,Uu,U2i) Zi,xzl^ KhiXi-x)^- 
In the following, we will show that 

n 

n~^^l2{f]{x,Xi) + mjzi^:c, Uu, U2i) Zi^^^zl^ Kh{Xi -x) = -A + Op(l), 

i=l 

where A = cr'^{x)fx{x) [ ) , with iJii — j fK{t)dt, and Op(l) is uniform in 

X e A" and ||6|| < mo, for some fixed constant mo > 0. To show this, we need the fol- 
lowing smoothness result. Let An{x, m) = i2{f](x, X) + rri^Zx, Ui, U2) z^z^ Kh{X — 
x), with ||m|| < 1. Then, under the conditions (C1)-(C7), we can show that 

\An{xi,m-i) - An{x2.,m2)\ < k{X,Ui,U2){\\mx - m^W + \xi - X2\) 
for some integrable function $k(X, U_1, U_2)$. Thus, using the triangle inequality,



1 

- ^2{ri{x, X,^ + mjzi^^, Uu, U2i) Zi^^zf^ Kh{Xi - x) - (- A) 

i=l 

1 " I 

< - ^ I {^2(^(3;, Xi) + mjzi^^, Uu, U2i) - i2iv{x, Xi), Uu, U2i)}zi^^zl^Kh{X, 
1 I 

- VU2(^(x,X,),[/i„[/2i) -4(r/(X,),f/i„f/2,)} X Kh{Xi 

Tl I t * ' 

= 1 

1 " 

- (2{v{X^), Uu, f/2.)^.,.<. Ki,{Xi -x)- E{e2{v{X), Uu U2)Z,Z 



1=1 



+ sup 

ri,x 



+ sup 

T],X 



i=l 



xKh{X-x)\x} + E{i2iv{X),Ui,U2) z,zl K\iX -x)\x} + A 

for 1] in a compact set and x & X. The first sum goes to zero by the previous argument 
and the Dominated Convergence theorem. Similarly, the second sum converges to zero 
provided that hrS^^"^^/^ = 0(1) and ||6|| < mo, for some fixed constant mo > 0. The 
first part in the last term goes to zero with probability one by the uniform weak law 
of large numbers and the second part vanishes by direct calculation. We thus obtain 

r(6) = Wnix)-2-^b^Ab{l + Op{l)), 

uniformly for x E X, where 

n 

Wn{x) = ^nY,(iMx,Xi),Uu,U2i) Z,,,K {{X, - x) / h) . 
i=l 



Using the quadratic approximation lemma (Fan and Gijbels, 1996, p. 210),

$$\hat b = A^{-1} W_n(x) + o_p(1),$$

provided that $W_n$ is a stochastically bounded sequence of random vectors. The first
entry of $\hat b$ directly yields the result, i.e.



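$$\gamma_n^{-1} \{\hat\eta_h(x) - \eta(x)\} = \frac{\gamma_n}{\sigma^2(x) f(x)} \Bigl[ \sum_{i=1}^{n} \ell_1(\eta(X_i), U_{1i}, U_{2i})\, K((X_i - x)/h)
+ \sum_{i=1}^{n} \bigl\{ \ell_1(\bar\eta(x, X_i), U_{1i}, U_{2i}) - \ell_1(\eta(X_i), U_{1i}, U_{2i}) \bigr\}\, K((X_i - x)/h) \Bigr] (1 + o_p(1)).$$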



Proof of Lemma 2. Note that

The approximation of the first term 

ll t K (0) = ft-'A-(O) E /-'(A-) + o,(ft-/'^) 

yields the first result. We can decompose T„2 = r„2i + r„22, where 

/u — 1 

Kh{X,-Xk)Kh{X,-Xk)]. 
We deal with T„2i and Tn22 separately. For T„2i, note that 

= j^v E ''.('K-yo. c/u.. "'"^ A-'(o) 



The first sum can be shown to be 

WE-W (a2(A-,)/(.Y,))^ A-(0)+o,(A )=0,(„ ft ). 
Therefore, let 



n n — 1 ^ — ^ 

^ ^ i<k 



{a2(X,)/(X,)}2 ^ - {a2(X,)/(X,)}^ 



and the second sum becomes (K, + o(l))/2 + Op ( n'^/^/i-^) + ojh-^/^). The decom- 
position theorem for U-statistics ( iHoeffdingl . Il948[ ) allows us to show that Var{Vn) = 
as follows. First note that the leading term of K is -h"^ E f~^(X) J K'^{t)dt. 
Hence, as n/i — > oo and /i — > 0, we obtain 

Similarly, we can decompose T„22 = 2n22i + 2^222 with 



Kh{Xi-Xk)KniXj-Xk)}, 



For A; 7^ i, j, define 



It can be easily shown that Var{n ^ Ylk^ij Qijk,h) = 0{n ^). Then, 

7;22i = 2n-2(n-2) Y^^^i^li^i)^ U2i)k{r]{X^), C/i,-, C/2,-) nQijk,h\Xi, Xj)+Op{h-'/^), 



i<j 

where 



EiQijk,h\Xi,Xj) = - {ha\Xi)f{Xi)}-^ J K{t) K{{Xj-Xi)/h)dt. 

It is also easy to show Var{Tn222) = O (n~^/i~^), implying r„222 = 
Combining r„2i, T„22i and r„222 yields 
Proof of Lemma 3. Recall that

$$W(n) = n^{-1} h^{1/2} \sum_{i \ne j} \{\sigma^2(X_i) f(X_i)\}^{-1}\, \ell_1(\eta(X_j), U_{1j}, U_{2j})\, \ell_1(\eta(X_i), U_{1i}, U_{2i})
\, \{ 2K_h(X_j - X_i) - K_h * K_h(X_j - X_i) \}.$$
We shall show that Wn satisfies conditions in Proposition 3.2. Let 

where 
and 

61 = 2Kh{X, - X,){a\Xi)f{X,)}~', b2{i,j) = z), 

bs{z,j) = Kf, * Kf,{X, - X,){a\Xi)f{Xi)}-^, b,{z,j) = b^U,i). 

Thus we can write W{n) = VFj^, and W{n) is clean directly follows from the 

i<j 

first Bartlett identity. For the variance of W{n), note that Var{W{n)) = X]t<j -^(^jj)- 
Thus we calculate E[{Bn{iJ)ii{e{Xi), Uu, U2i) ii{0{Xj), Uij, U2j)}^]. To simplify our 
presentation, let in = ii{9{Xi),Uii,U2i) and denote the m-fold convolution at t by 
K{t, m) = K * ■ ■ ■ * K{t). Through direct calculations, we obtain 



^2/2 / Y — Y 



h^{a^iX,)f{X,)y V h 

( f 2/ V \T^2 f X2 — Xi 



a\X2)K' , f{X2)dX2 \ f{X,)dX 



h^J [J ' \ h 

4 /-/'ra / ^2(^^)^^^^)^.(^)^^^(^^)^^^(^_^Q(^)) 



/i J (t2(Xi) 
iir(0,2)Er^(X)(l + O(/i)). 






Similarly, 



E{bl{i,j) il 




= 4/i-^i^(0,2)^/-i(X)(l + O(/i)), 






= h-^K{'dA)Er\X){l + 0{h)), 


E{bl{i,j) £l 




= h-^K{QA)Er\X){l + 0{h)), 


E{b,it,jMt,j)il 




= 4/1-1^(0, 2)E/-i(X)(l + 0(/i)), 










^l) 


= 2h-'K{0,3)Ef-\X){l + O{h)), 


E{b2{iJMiJ) 4 




= 2h-'K{0,3)Er\X){l + O{h)), 






= 2h-'K{0,3)Ef-\X){l + O{h)), 


E{bs{ij)b,{ij) el 




= h-'K{0A)Er\X){l + O{h)). 



Thus, 

E[B^{z,j)el i^] = h-'{16K{0,2) - 1QK{0,3) + iK{0,4)}Ef-\X){l + 0{h)). 
The leading term of n'^h f^.] yields 

V* = 2{4X(0, 2) - 4X(0, 3) + X(0, 4)}£;/-^(X) = 2 | |2X - X * X| |^ Ef-\X). 

For the condition on G/, note that ^(61(1,2)^11^12)^ = ^(63(1, 2)^11^12)^ = 0{h-^). 
Then E{W^^) = n-^h^O{h^), which impUcs Gj = 0{n-^h-^) = o(l). Similarly, 
the condition on Gu can be verified by noting that E{Wi2Wi^) — 0(^(1^1*2)) = 
0(n~^/i~^). Thus, Gii — 0{n~^h~^) — o(l). For the last condition we need to check 
the order of E{Wi2W23WuW4i). Calculations for few terms yield, 

Eiblil,2)bli2,3)bli3A)bli^A) iU2 ^'i O = 0{h-') 
£;(&?(!, 2)6^(2, 3)6?(3, 4)6^(4, l)f,2 £'3^ £;^) = 0(h~') 

E{bl{l,2)bi{2,3)bl{3A)bl{^,l)iU2^3 = 0{h-') 
£;(6?(1, 2)6^(2, 3)6^(3, 4)6^(4, l)f,^ ^3^^^) = 0(0 
£;(6^(1, 2)6^(2, 3)6^(3, 4)6^(4, l)£f 424242) = 0(0. 



Since terms with other combinations will be of the same order, we conclude that 

E{Wi2W23WuW4i) = n-^h^O{h-^) = 0{n-%, 
and Giv = 0{h) = o(l). This completes the proof. 